ERDŐS AND ARITHMETIC PROGRESSIONS


W.T. GOWERS 


Abstract. Two of Erdős’s most famous conjectures concern arithmetic progressions. In
this paper we discuss some of the progress that has been made on them. 


1. Introduction 

Possibly the best known of all of Erdős’s many conjectures is the following striking
statement.

Conjecture 1.1. Let A be a set of positive integers such that $\sum_{n \in A} n^{-1} = \infty$.
Then A contains arbitrarily long arithmetic progressions.

This conjecture is still wide open. Indeed, it is not even known whether A must contain
an arithmetic progression of length 3.

There is another conjecture of Erdős about arithmetic progressions. It is not as famous
as the first, but it is still well known and extremely interesting. It is sometimes referred to
as Erdős’s discrepancy problem.

Conjecture 1.2. Let $\epsilon_1, \epsilon_2, \epsilon_3, \dots$ be a sequence taking values in the set $\{-1,1\}$.
Then for every constant C there exist positive integers n and d such that
$|\sum_{m=1}^{n} \epsilon_{md}| \geq C$.

The purpose of this paper is to say a little bit about the two conjectures and to discuss 
some known results and related problems. 

2. Arithmetic progressions in sparse sets 

What does it tell us about a set A if $\sum_{n \in A} n^{-1}$ diverges? Clearly it tells us that in some
sense A is not too small, since the larger it is, the more likely the sum of its reciprocals
is to diverge. A rough interpretation of the condition turns out to be that the density
$\delta(n) = n^{-1}|A \cap \{1,2,\dots,n\}|$ decreases not too much faster than $(\log n)^{-1}$. One way of
seeing this is as follows. Writing $1_A$ for the characteristic function of A, we have the trivial
identity

$1_A(n) = n\delta(n) - (n-1)\delta(n-1)$,

from which (if we adopt the convention that $\delta(0) = 0$) it follows that

$\sum_{n \in A} n^{-1} = \sum_n (\delta(n) - \delta(n-1) + \delta(n-1)/n) = \sum_n \delta(n-1)/n$.

(Strictly, the partial sums up to N differ by the telescoping term $\delta(N) \leq 1$, which does not
affect divergence.) Thus, if the density decreases like $(\log n)^{-1}$ then we get a sum like
$\sum_n 1/(n \log n)$, which diverges, while if it decreases like, say,
$(\log n)^{-1}(\log\log n)^{-2}$, then we get a convergent sum.
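The telescoping identity above can be checked exactly on any finite set. The following is a minimal Python sketch, not part of the original argument: the function name `reciprocal_sum_two_ways` and the choice of test set are illustrative assumptions. For a finite cutoff N the telescoping sum leaves the boundary term $\delta(N)$, which the sketch adds explicitly.

```python
from fractions import Fraction


def reciprocal_sum_two_ways(A, N):
    """Return (lhs, rhs) where
    lhs = sum of 1/n over n in A with n <= N, and
    rhs = delta(N) + sum_{n <= N} delta(n-1)/n,
    with delta(n) = |A ∩ {1..n}| / n and delta(0) = 0."""
    members = set(A)
    count = 0  # |A ∩ {1, ..., n-1}| at the start of each iteration
    lhs = Fraction(0)
    rhs = Fraction(0)
    for n in range(1, N + 1):
        delta_prev = Fraction(count, n - 1) if n > 1 else Fraction(0)
        rhs += delta_prev / n
        if n in members:
            count += 1
            lhs += Fraction(1, n)
    rhs += Fraction(count, N)  # boundary term delta(N) left over by telescoping
    return lhs, rhs


primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
lhs, rhs = reciprocal_sum_two_ways(primes, 50)
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to floating-point error.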

Of course, the density does not have to decrease smoothly in this way, but this nevertheless
gives a good general picture of what the conjecture is saying. In particular, the simple
calculation just given tells us that if $\sum_{n \in A} n^{-1} = \infty$ then there must be infinitely many
n for which $\delta(n) \geq (\log n)^{-1}(\log\log n)^{-2}$, so to prove Erdős’s conjecture it is sufficient to
prove the following statement.

Conjecture 2.1. For every k there exists n such that if A is any subset of $\{1,\dots,n\}$
of cardinality at least $n/(\log n(\log\log n)^2)$ then A contains an arithmetic progression of
length k.

It is also not hard to show that to disprove Erdős’s conjecture, it would be sufficient to
show that for every k and every sufficiently large n there exists a subset $A \subset \{1,\dots,n\}$
of cardinality at least $n/\log n$ that does not contain an arithmetic progression of length k.
To do this, for each sufficiently large r let $A_r$ be a subset of $\{2^r+1,\dots,2^{r+1}\}$ of size at
least $cr^{-1}2^r$ that contains no arithmetic progression of length k and let A be the infinite
set $A_s \cup A_{s+2} \cup A_{s+4} \cup \dots$ for a sufficiently large s. Then for every sufficiently large n we
have $\delta(n) \geq c'(\log n)^{-1}$ and A contains no arithmetic progression of length k.

Thus, Erdős’s conjecture is basically addressing the following problem, and suggesting
an approximate answer.

Problem 2.2. Let k and n be positive integers. How large does a subset $A \subset \{1,2,\dots,n\}$
have to be to guarantee that it contains an arithmetic progression of length k?

The suggested answer is that a cardinality of somewhere around $n/\log n$ should be enough.
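For very small n, Problem 2.2 can be answered by exhaustive search. The sketch below is an illustration only: the names `contains_ap` and `max_ap_free_size` are hypothetical, and the backtracking search is exponential, so it says nothing about the asymptotic question. It computes the largest size of a subset of $\{1,\dots,n\}$ with no k-term progression, so one more element than this forces a progression.

```python
def contains_ap(values, k):
    """True if `values` contains a k-term arithmetic progression (k >= 2)
    with non-zero common difference."""
    members = set(values)
    ordered = sorted(members)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            d = b - a
            if all(a + j * d in members for j in range(2, k)):
                return True
    return False


def max_ap_free_size(n, k):
    """Largest size of a subset of {1, ..., n} with no k-term progression.
    Exhaustive backtracking, so only practical for small n."""
    best = 0
    chosen = []

    def extend(start):
        nonlocal best
        best = max(best, len(chosen))
        for v in range(start, n + 1):
            if len(chosen) + (n - v + 1) <= best:
                break  # even taking all of v, ..., n cannot beat the current best
            chosen.append(v)
            if not contains_ap(chosen, k):
                extend(v + 1)
            chosen.pop()

    extend(1)
    return best


print([max_ap_free_size(n, 3) for n in range(1, 10)])
```

For example, the set {1, 2, 4, 5} shows that four elements of {1, ..., 5} do not suffice to force a progression of length 3.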

A natural starting point would be to prove any bound of the form o(n). This gives us
another famous conjecture of Erdős, made with Paul Turán in 1936 [10].

Conjecture 2.3. For every positive integer k and every $\delta > 0$ there exists n such that
every subset $A \subset \{1,2,\dots,n\}$ of cardinality at least $\delta n$ contains an arithmetic progression
of length k.

Even this much weaker conjecture turned out to be very hard, and very interesting 
indeed: it can be seen as having given rise to several different branches of mathematics. 
The first progress on the Erdős-Turán conjecture was due to Roth, who proved in 1953
that it is true when k = 3 [31]. Roth’s proof, which used Fourier analysis, showed that
$\delta$ could be taken to be $C/\log\log n$ for an absolute constant C. The problem for longer
progressions turned out to be much harder, and it was not until 1969 that there was further
progress, when Szemerédi proved the result for k = 4 [36], this time with a bound for $\delta$
that was too weak to be worth stating explicitly. And a few years later (the paper was
published in 1975), Szemerédi managed to prove the general case [37].

2.1. Other proofs of Szemerédi’s theorem. This result was hailed at the time and is
still regarded as one of the great mathematical results of the second half of the twentieth
century, but it was by no means the end of the story: over the last four decades its
significance has steadily grown. In this respect, the Erdős-Turán conjecture is like many
conjectures of Erdős. Initially it seems like an amusing puzzle, but the more you think
about it, the more you come to understand that the “amusing puzzle” is a brilliant distillation
of a much more fundamental mathematical difficulty. There are few direct applications
of Szemerédi’s theorem (though they do exist), but an enormous number of applications
of the methods that Szemerédi developed to prove the theorem, and in particular of his
famous regularity lemma.

Since then, there have been several other proofs of the theorem, which have also introduced
ideas with applications that go well beyond Szemerédi’s theorem itself. In 1977,
Furstenberg pioneered an ergodic-theoretic approach [11], giving a new proof of the theorem
and developing a method that went on to yield the first proofs of many generalizations,
of which we mention three notable ones.

The first is a natural multidimensional version of Szemerédi’s theorem, due to Furstenberg
and Katznelson [12].

Theorem 2.4. For every $\delta > 0$, every positive integer d and every finite subset $K \subset \mathbb{Z}^d$ there
exists n such that every subset $A \subset \{1,\dots,n\}^d$ of size at least $\delta n^d$ contains a homothetic
copy of K: that is, a set of the form $aK + b$ for some positive integer a and some $b \in \mathbb{Z}^d$.

Next, we have the “density Hales-Jewett theorem”, also due to Furstenberg and Katznelson
[13]. For this we need a definition. If x is a point in $\{1,\dots,k\}^n$ and E is a non-empty subset of
$\{1,2,\dots,n\}$, then for each $1 \leq j \leq k$ let $x \oplus jE$ be the point $y \in \{1,\dots,k\}^n$ such that
$y_i = j$ for every $i \in E$ and $y_i = x_i$ otherwise. A combinatorial line in $\{1,\dots,k\}^n$ is a set
of points of the form $\{x \oplus jE : j = 1,\dots,k\}$.

Theorem 2.5. For every $\delta > 0$ and every k there exists n such that every subset
$A \subset \{1,\dots,k\}^n$ of cardinality at least $\delta k^n$ contains a combinatorial line.

Finally, the Bergelson-Leibman theorem [2] is the following remarkable “polynomial
version” of Szemerédi’s theorem.

Theorem 2.6. For every $\delta > 0$ and every sequence $P_1,\dots,P_k$ of polynomials with integer
coefficients and no constant term there exists n such that every subset $A \subset \{1,2,\dots,n\}$ of
cardinality at least $\delta n$ contains a subset of the form $\{a + P_1(d), a + P_2(d), \dots, a + P_k(d)\}$
with $d \neq 0$.

If we take $P_i(d)$ to be $(i-1)d$, then we recover Szemerédi’s theorem, but this result is
considerably more general. For example, amongst many other things it implies that in
Szemerédi’s theorem we can ask for the common difference of the arithmetic progression
we obtain to be a perfect cube.

Another approach to Szemerédi’s theorem was discovered approximately twenty years
later by the author [16, 17]. One of the reasons that Roth’s proof for progressions of length
3 was not quickly followed by a proof of the general case was that while the number of
arithmetic progressions of length 3 in a set can be expressed very nicely in terms of Fourier
coefficients, there is no useful Fourier expression for the number of arithmetic progressions
of length 4 (or more). The proofs in [16, 17] replaced the trigonometric functions that Roth
used by polynomial phase functions (that is, functions of the form $\exp(2\pi i p(x))$ for some
polynomial p) restricted to arithmetic progressions. This strongly suggested that there
should be a kind of “higher-order Fourier analysis”, and, in a major recent achievement,
such a theory was worked out by Green, Tao and Ziegler [22] (see also [20, 4]). Their inverse
theorem for the uniformity norms had a very important application that we shall describe
briefly later.

A fourth approach to the theorem had its roots in a fascinating argument of Ruzsa and
Szemerédi [32], who used Szemerédi’s regularity lemma to prove the following result, which
is now known as the triangle removal lemma.

Theorem 2.7. For every $\epsilon > 0$ there exists $\delta > 0$ such that if G is any graph with n
vertices and at most $\delta n^3$ triangles, then there is a triangle-free graph that differs from G
by at most $\epsilon n^2$ edges.

By applying the triangle removal lemma to a suitably chosen graph, one can deduce 
Roth’s theorem (with a much worse bound). 
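The text does not specify the graph. One standard choice, given here as an assumption rather than as the construction the author necessarily has in mind, is a tripartite graph on three copies X, Y, Z of $\mathbb{Z}/N\mathbb{Z}$ with N odd: join x to y if $y - x \in A$, y to z if $z - y \in A$, and x to z if $(z - x)/2 \in A$. Triangles then correspond to triples $(a, b, c) \in A^3$ with $a + c = 2b$, each counted N times. The Python sketch below (with hypothetical function names) checks this correspondence by brute force.

```python
def triangle_count(A, N):
    """Triangles in the tripartite graph on X, Y, Z = Z/NZ (N odd) with
    x~y iff y-x in A, y~z iff z-y in A, x~z iff (z-x)/2 in A."""
    A = {a % N for a in A}
    half = (N + 1) // 2  # inverse of 2 modulo the odd number N
    count = 0
    for x in range(N):
        for y in range(N):
            if (y - x) % N not in A:
                continue
            for z in range(N):
                if (z - y) % N in A and ((z - x) * half) % N in A:
                    count += 1
    return count


def progression_count(A, N):
    """Number of pairs (a, c) in A^2 whose midpoint (a+c)/2 mod N lies in A.
    The degenerate case a = c is included."""
    A = {a % N for a in A}
    half = (N + 1) // 2
    return sum(1 for a in A for c in A if ((a + c) * half) % N in A)


A, N = {1, 2, 3}, 7
print(triangle_count(A, N), N * progression_count(A, N))
```

The degenerate triples a = b = c give N|A| edge-disjoint triangles, which is what makes the removal lemma applicable: removing all of them needs many edges, so the graph must contain many triangles, and hence a non-degenerate progression.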
It is natural to wonder whether this idea can be generalized to give a proof of the general
case of Szemerédi’s theorem. This thought led Rödl to formulate an approach to
the theorem, in which the regularity lemma was generalized from graphs to hypergraphs.
The generalization is not straightforward to state, and proving both it and an associated
“counting lemma” turned out to be hard. Frankl and Rödl proved a hypergraph regularity
lemma in 1992 [14] and in 2002 managed to use it to prove Szemerédi’s theorem for progressions
of length 4 [15]. The general case was proved by this method in independent work
of Nagle, Rödl and Schacht [27] and the author [18]. (In the latter proof the formulation
of the hypergraph regularity lemma was different, which made it harder to prove but made
the counting lemma easier to prove.) Hypergraph regularity has gone on to have several
other applications.

An important development in our understanding of the regularity lemma came with work
of Lovász and others on graph limits. Loosely speaking, with the help of the regularity
lemma one can show that very large graphs look like measurable functions from $[0,1]^2$
to [0,1]. In a way this is not too surprising, because the regularity lemma allows one to
approximate any graph with just a bounded amount of information about densities between
subsets. What is more surprising, however, is that the graph-limits point of view leads to
a simpler proof of the regularity lemma itself [24]: for the limiting arguments one can use
a weaker regularity lemma, and once one has passed to a measurable function on $[0,1]^2$,
one has a limit of step functions, which implies that if one partitions into a very fine grid,
then the function will be approximately constant on most squares.

Once one is given the statement of Szemerédi’s regularity lemma and the basic idea of
the standard proof, working out the details is not especially hard to begin with. However,
the limits approach generalizes to hypergraphs [9], where proving corresponding results
is much harder, and gives rise to similar simplifications. The resulting hypergraph-limits
approach to Szemerédi’s theorem has a strong claim to be the simplest known proof of the
theorem. More generally, graph and hypergraph limits have become a very active area of
research with several other applications.

We briefly mention one other candidate for the simplest known proof of Szemerédi’s
theorem, which is a combinatorial proof of the density Hales-Jewett theorem, discovered
by a “massive online collaboration” [28]. It is easy to see that the density Hales-Jewett
theorem implies Szemerédi’s theorem: one just needs to interpret the points in $\{1,\dots,k\}^n$
as base-k representations of integers, and then every combinatorial line is an arithmetic
progression of length k (but not vice versa). Recently, this proof has been simplified yet
further [8].
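The passage from combinatorial lines to arithmetic progressions can be made concrete. In the Python sketch below, the function names and the 0-indexed coordinates are illustrative assumptions. A point of $\{1,\dots,k\}^n$ is read as a list of base-k digits, least significant first, and the k points of a combinatorial line then map to an arithmetic progression whose common difference is $\sum_{i \in E} k^i$.

```python
def combinatorial_line(x, E, k):
    """Points x ⊕ jE for j = 1..k: coordinates in E are set to j,
    the rest are copied from x. Coordinates are 0-indexed here."""
    return [tuple(j if i in E else xi for i, xi in enumerate(x))
            for j in range(1, k + 1)]


def as_integer(point, k):
    """Read a point of {1..k}^n as base-k digits (least significant first)."""
    return sum(digit * k ** i for i, digit in enumerate(point))


line = combinatorial_line((2, 1, 3, 1), {1, 3}, 3)
values = [as_integer(p, 3) for p in line]
print(values)  # [59, 89, 119]
```

Here the common difference is 30 = 3^1 + 3^3. The converse fails because a general progression changes digits by different amounts, or causes carries, so it need not come from a line.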

2.2. Quantitative considerations. As we saw earlier, Conjecture 1.1 is roughly saying
that a density of $(\log n)^{-1}$ is enough to guarantee an arithmetic progression. But what is
special about this bound? Indeed, is it special?

There are two sensible answers to this question: yes and no. The reason the bound
is special, and the reason that Erdős asked the question, is that the primes have density
around $(\log n)^{-1}$ in the first n integers. One of Erdős’s formative mathematical experiences
was proving for himself that the sum of the reciprocals of the primes diverges, and it is
clear that his main motivation for the sum-of-reciprocals conjecture was that it would
imply that the prime numbers contain arbitrarily long arithmetic progressions. This would
be an example of a result of a kind that Erdős particularly liked: a result that appears to
be number-theoretic but turns out to be true for purely combinatorial reasons.

It would have been fascinating to know how Erdős would have reacted to the proof by
Green and Tao [19] that the primes do indeed contain arbitrarily long arithmetic progressions.
In fact, Green and Tao proved the following stronger result.

Theorem 2.8. For every $\delta > 0$ and every k there exists n such that if A is any set of
at least $\delta n/\log n$ primes between 1 and n, then A contains an arithmetic progression of
length k.

That is, not only do the primes contain arbitrarily long arithmetic progressions, but so 
does any subset of the primes of positive relative density. (Of course, this too is implied 
by the sum-of-reciprocals conjecture.) 

The proof of this celebrated result did not go according to Erdős’s plan, in that it
made significant use of distribution properties of the primes. However, despite this, it
would almost certainly have appealed to Erdős’s love of combinatorial arguments, since
the main new ingredient in the proof was in a sense “purely combinatorial”: they proved a
“relative version” of Szemerédi’s theorem, showing that a set A that is a relatively dense
subset of a set B must contain an arithmetic progression of length k, provided that B is
sufficiently large and sufficiently “pseudorandom” in a technical sense that they defined.
(The result they stated and used was actually more general than this: B was replaced by
a “pseudorandom measure”.) In order to prove this result, they used Szemerédi’s theorem
as well as techniques from several of the proofs of the theorem. Thus, the work on the
Erdős-Turán conjecture did in the end result in a solution to the problem that so fascinated
Erdős.

Green and Tao followed this theorem with a project to obtain asymptotic bounds for the
number of arithmetic progressions of length k (and many other configurations) in the primes
up to n. Over several years, they published a sequence of major papers, culminating in a
proof, with Tamar Ziegler, of the inverse theorem for the uniformity norms [22], mentioned
earlier, at which point the project was completed.

2.2.1. How natural is Erdős’s conjecture? The fact that Erdős’s conjecture implies an extremely
striking result about the primes is not really evidence that the correct bound in
Szemerédi’s theorem is anywhere near $\delta = (\log n)^{-1}$. Obtaining such a bound would be
wonderful, but there is no strong reason to suppose that it would be the last word on the
subject.

In particular, the best known lower bound for Szemerédi’s theorem is far smaller than
$(\log n)^{-1}$. It comes from a construction of Behrend in 1946 [1]. Behrend started from the
observation that the surface of a sphere contains no three points in a line, and in particular
no three points such that one is the midpoint of the other two. The argument proceeds
as follows. For suitable integers m and d, to be optimized at the end of the argument,
one shows by the pigeonhole principle that there exists r such that the sphere of radius
r contains many points in the grid $\{1,\dots,m\}^d$. Next, one embeds that grid “isomorphically”
into the set $\{1,2,\dots,(2m)^d\}$ by thinking of the points in $\{1,\dots,m\}^d$ as base-$2m$
representations of integers. The main property of this “isomorphism” is that it does not
create any arithmetic progressions of length 3 that were not present before. Finally, one
maximizes the number of points in the spherical surface subject to the constraint that
$(2m)^d = n$. The resulting bound is $\delta = \exp(-c\sqrt{\log n})$.
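The steps of Behrend's construction can be carried out directly for small parameters. The Python sketch below is an illustration under stated assumptions: the names `behrend_set` and `has_3ap` are hypothetical, the parameters m and d are fixed by hand rather than optimized, and the most populated sphere is found by enumeration instead of by the pigeonhole argument.

```python
from collections import defaultdict
from itertools import product


def behrend_set(m, d):
    """Integers in {1, ..., (2m)^d} obtained from the most populated sphere
    in the grid {1..m}^d, reading grid points as base-2m digit strings."""
    shells = defaultdict(list)
    for point in product(range(1, m + 1), repeat=d):
        shells[sum(c * c for c in point)].append(point)  # group by squared radius
    largest_shell = max(shells.values(), key=len)
    base = 2 * m  # digits are at most m, so sums of two digits never carry over
    return sorted(sum(c * base ** i for i, c in enumerate(p))
                  for p in largest_shell)


def has_3ap(values):
    """True if some element is the midpoint of two other distinct elements."""
    members = set(values)
    ordered = sorted(members)
    return any((a + c) % 2 == 0 and (a + c) // 2 in members
               for i, a in enumerate(ordered) for c in ordered[i + 1:])


B = behrend_set(4, 3)
print(len(B), has_3ap(B))
```

The base $2m$ is what makes the embedding respect midpoints: if $x + z = 2y$ as integers, then the same relation holds digit by digit, so y would be the midpoint of x and z on the sphere, which is impossible unless the three points coincide.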

This bound helps to explain why it is so hard to determine optimal bounds for Szemerédi’s
theorem, even when the progressions have length 3. On a first acquaintance with
the problem, it is natural to conjecture that the extremal example would be given by
a simple probabilistic construction. If that were the case, then there would be hope of
proving that that construction was best possible by showing that “quasirandom sets are
best”. An approach like this works, for example, if one wishes to minimize, for a given
cardinality of a subset $A \subset \mathbb{Z}/n\mathbb{Z}$, the number of quadruples $(a_1, a_2, a_3, a_4) \in A^4$ such that
$a_1 + a_2 = a_3 + a_4$, at least when that cardinality is significantly greater than $\sqrt{n}$. However,
random sets do not work for progressions of length 3: the standard method of choosing
points randomly with probability p, where p is chosen such that the expected number of
progressions of length 3 is at most half the expected number of points, and then deleting
a point from each progression, gives a lower bound of $\delta = cn^{-1/2}$, far smaller than the
Behrend bound.

The Behrend bound can be slightly improved when the progressions are longer, but
for now let us focus on progressions of length 3. What is the correct bound for the first
non-trivial case of Szemerédi’s theorem? This is a fascinating question that is still wide
open, despite the attention of many mathematicians. However, there has been some very
interesting progress.

As mentioned earlier, the original argument of Roth gave an upper bound of $C(\log\log n)^{-1}$.
This bound was improved to one of the form $(\log n)^{-c}$ by Heath-Brown [23] and Szemerédi
[38]. An important new technique, the use of regular Bohr sets, was introduced by Bourgain
in 1999 [6], to improve the constant c. More precisely, he obtained a bound of
$C(\log\log n/\log n)^{1/2}$. A difficulty with the problem is that cyclic groups are not rich in
subgroups, so dropping down to a subgroup is not an option. Regular Bohr sets are a
kind of substitute for subgroups, allowing Bourgain to get round this difficulty. They have
subsequently been used in many other proofs.

For a while, Bourgain’s result was seen as the limit of what could be achieved without 
a radical change of approach. It therefore came as a surprise in 2008 when Bourgain 
introduced an idea that allowed him to carry out the general scheme of his proof more 
efficiently and obtain a power of 2/3 instead of 1/2. Sanders [33] pushed this approach 
further and obtained a power of 3/4. 

Sanders followed up this improvement with a major advance on the problem [34]. He
found an argument that was substantially different from Bourgain’s and used it to obtain
a bound of $C(\log\log n)^{O(1)}/\log n$. Thus, he was tantalizingly close to the logarithmic barrier.
In fact, even a bound of $c\log\log n/\log n$ would be enough to prove purely combinatorially
that the primes contain infinitely many arithmetic progressions of length 3, since if m is a
number with many small prime factors, then most arithmetic progressions with common
difference m contain almost no primes, which means that some have a high density of
primes. Working out the details, one can find arithmetic progressions of length n in which
the primes have density $c\log\log n/\log n$.

2.2.2. What is the right bound for Roth’s theorem? That is where things stand today. Is
the Behrend bound correct, or is Sanders’s upper bound close to optimal? Nobody knows,
but there are two recent results that give weakish evidence that the Behrend bound
is more like the truth of the matter.

The first of these concerns a closely related problem about subsets of $\mathbb{F}_3^n$ (where $\mathbb{F}_3$ is the
field with three elements). How large must a subset of $\mathbb{F}_3^n$ be to guarantee that it contains
an affine line, or equivalently three distinct points x, y, z such that $x + y + z = 0$? (Such a triple
can also be thought of as an arithmetic progression, since if $x + y + z = 0$, then $2y = x + z$.)

It was observed by Meshulam that Roth’s original argument works very cleanly in this
context (the main reason being that, in contrast with the cyclic group $\mathbb{Z}/n\mathbb{Z}$, the group $\mathbb{F}_3^n$
is very rich in subgroups), and yields the following theorem [26].

Theorem 2.9. There exists a constant C such that every subset $A \subset \mathbb{F}_3^n$ of density at least
$C/n$ contains an affine line.

Thus, in this context, we have a logarithmic bound (since n is logarithmic in the size, $3^n$,
of the set $\mathbb{F}_3^n$).

The gap between this and the best known lower bound is even more embarrassingly large
than it is for Roth’s theorem, since the lower bound is of the form $\alpha^n$ for some constant
$\alpha < 3$. (To obtain such a lower bound, one finds a low-dimensional example and takes
powers of that example.)
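The "take powers of a small example" step can be sketched as follows. The base example used here, $\{0,1\}^2 \subset \mathbb{F}_3^2$, is chosen only because it is easy to verify by hand; it is an assumption of this illustration and gives just $\alpha = 2$, whereas better low-dimensional examples give larger constants. The function names are hypothetical.

```python
from itertools import product


def is_line_free(points):
    """True if no three distinct points of `points` (vectors over F_3)
    sum to zero, i.e. the set contains no affine line."""
    pts = set(points)
    for x in pts:
        for y in pts:
            if x != y:
                # the unique third point on the line through x and y
                z = tuple((-a - b) % 3 for a, b in zip(x, y))
                if z in pts:
                    return False
    return True


def cartesian_power(points, t):
    """t-fold product of the set with itself: concatenate t points."""
    return {sum(combo, ()) for combo in product(points, repeat=t)}


base = {(0, 0), (0, 1), (1, 0), (1, 1)}  # line-free subset of F_3^2
big = cartesian_power(base, 2)           # subset of F_3^4
print(len(big), is_line_free(big))
```

The product stays line-free because a line in the product projects, in each block of coordinates, either to a line or to a single point of the base set, and the former is excluded.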

It was felt by many people that this was a better problem to attack than attempting to
improve the bounds in Roth’s theorem, since working in the group $\mathbb{F}_3^n$ presented technical
simplifications without avoiding the deeper mathematical difficulties. And yet, despite the
simplicity of the arguments for both the upper and lower bounds, for many years nobody
could come up with any improvement. There was therefore considerable excitement in 2011
when Bateman and Katz [3] broke the logarithmic barrier for this problem, improving the
upper bound to $C/n^{1+\epsilon}$ for a small but fixed positive $\epsilon$. Initially there was a hope that it
might be possible to combine their ideas with those of Sanders to break the logarithmic
barrier in Roth’s theorem as well, thereby proving the first non-trivial case of Erdős’s
sum-of-reciprocals conjecture, but unfortunately good reasons emerged to suppose that
this cannot be done without significant new ideas. However, the fact remains that the
logarithmic barrier is not the right bound for the $\mathbb{F}_3^n$ version of the problem, which makes
it hard to think of a good reason for its being the right bound for Roth’s theorem itself.

The second recent result, also from 2011, makes it look as though a Behrend-type bound
might be correct. Roth’s theorem can be thought of as a search for solutions to the equation
$x + z = 2y$. Schoen and Shkredov, building on the methods that Sanders introduced to
prove his near-logarithmic bound for Roth’s theorem, showed that if we generalize this
equation, then we can obtain a much better bound [35].
Theorem 2.10. Let A be a subset of $\{1,2,\dots,n\}$ of density $\exp(-c(\log n)^{1/6-\epsilon})$. Then A
contains distinct elements $x_1, x_2, x_3, x_4, x_5$ and y such that $x_1 + x_2 + x_3 + x_4 + x_5 = 5y$.

Note that the Behrend lower bound is easily adapted to this equation (since if $x_1,\dots,x_5,y$
are distinct and satisfy that equation then they cannot all lie on the surface of a sphere),
so this result is within spitting distance of best possible.

Of course, one could state an Erdős-like corollary to this theorem: if A is a set of integers
such that $\sum_{n \in A} n^{-1}$ diverges, then A contains a non-degenerate solution to the equation
$x_1 + x_2 + x_3 + x_4 + x_5 = 5y$. However, the original result is more natural.

The result of Schoen and Shkredov is by no means conclusive evidence that the correct
bound for Roth’s theorem is of the form $\exp(-(\log n)^c)$, since convolutions of three or more
functions are significantly smoother than convolutions of two functions, a phenomenon that
also explains why the twin-prime conjecture and Goldbach’s conjecture are much harder
than Vinogradov’s three-primes theorem. However, one can at least say, in the light of
this result and the result of Bateman and Katz, that there is a significant chance that the
logarithmic barrier for Roth’s theorem will eventually be surpassed and the first non-trivial
case of Erdős’s conjecture proved.

2.2.3. Arithmetic progressions of length 4 or more. What happens for longer progressions?
As mentioned earlier, the bounds coming from Szemerédi’s proof are very weak. Furstenberg’s
proof was infinitary and gave no bound at all (though a discrete version of his
argument was later found by Tao [39], which in principle gave a weak quantitative bound).
The first argument to give a “reasonable” bound was the one in [16, 17], where the following
theorem was proved.

Theorem 2.11. Let A be a subset of $\{1,2,\dots,n\}$ of density at least $C(\log\log n)^{-c_k}$,
where $c_k > 0$ depends only on k. Then A contains an arithmetic progression of length k.

Green and Tao subsequently improved the bound for k = 4 to $\exp(-c\sqrt{\log\log n})$ [20]. And
that is the current state of the art, though for a finite-field analogue of the problem (again
with k = 4) they have a bound of the form $\exp(-(\log n)^{c})$ [21].

Will Erdős’s sum-of-reciprocals conjecture be proved any time soon? There seems at
least a fair chance that the case k = 3 will be established within, say, the next ten years.
There are significant extra difficulties involved when the progressions are longer, but a
significant amount of technology for dealing with longer progressions has now been developed.
Whether a bound for k = 3 will lead to a bound for longer progressions probably depends
a lot on what the proof for k = 3 looks like, and by how much it beats the logarithmic
bound. It may also depend on whether the inverse theorem for uniformity norms can be
proved with good quantitative bounds.

3. Erdős’s discrepancy problem

Let us now turn to Conjecture 1.2. Discrepancy problems are problems that ask how
“balanced” a colouring of a set can be with respect to some class of subsets. If we have a
red/blue colouring $\kappa$ of a set X and $A \subset X$, then define the discrepancy $\mathrm{disc}(\kappa, A)$ of $\kappa$ on
A to be the difference between the number of red elements of A and the number of blue
elements of A. The discrepancy $\mathrm{disc}(\kappa, \mathcal{A})$ of $\kappa$ with respect to a class $\mathcal{A}$ of subsets of X is
then $\max_{A \in \mathcal{A}} \mathrm{disc}(\kappa, A)$. The discrepancy problem for $\mathcal{A}$ is the problem of determining the
minimum of $\mathrm{disc}(\kappa, \mathcal{A})$ over all 2-colourings $\kappa$. We can of course think of $\kappa$ as a function from
X to $\{-1,1\}$ and then $\mathrm{disc}(\kappa, A)$ is $|\sum_{x \in A} \kappa(x)|$. The Erdős discrepancy problem is the
discrepancy problem for the set $\mathcal{A}$ of homogeneous arithmetic progressions: that is, arithmetic
progressions of the form $(d, 2d, 3d, \dots, md)$.

3.1. Known bounds. As with Szemerédi’s theorem, it is tempting to conjecture, again
wrongly, that random examples are best for this problem. If we choose a random sequence
$(\epsilon_i)$ of 1s and -1s, then the expected size of $|\sum_{m=1}^{n} \epsilon_{md}|$ is around $\sqrt{n}$, and occasionally the
size will be slightly bigger by a logarithmic factor.

A simple example that gives rise to much slower growth of these sums is the following,
observed by Borwein, Choi and Coons [5]. Every positive integer m can be written in a
unique way as $(3a \pm 1)3^b$ for integers a and b. We let $\epsilon_m = 1$ if m is of the form $(3a + 1)3^b$
and $-1$ if m is of the form $(3a - 1)3^b$. Note that this function is completely multiplicative:
$\epsilon_m\epsilon_n = \epsilon_{mn}$ for any two positive integers m and n. Therefore,
$|\sum_{m=1}^{n} \epsilon_{md}| = |\epsilon_d \sum_{m=1}^{n} \epsilon_m| = |\sum_{m=1}^{n} \epsilon_m|$
for any n and d, so analysing the example reduces to calculating the rate of
growth of the partial sums of the sequence.

To do this, we partition the integers from 1 to n according to the highest power of 3 that
divides them. Let $A_{i,n}$ be the set of multiples of $3^i$ that are at most n and are not multiples
of $3^{i+1}$. Then $\sum_{m \in A_{i,n}} \epsilon_m = 1$ if in the ternary representation of n the digit corresponding
to multiples of $3^i$ is 1, and 0 otherwise. It follows that $\sum_{m=1}^{n} \epsilon_m$ is equal to the number
of ternary digits of n that are equal to 1. In particular, it has magnitude at most $\log_3 n + 1$,
which is far smaller than $\sqrt{n}$.
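The digit-counting description of the partial sums is easy to confirm numerically. The following Python sketch (function names are illustrative assumptions) implements the Borwein-Choi-Coons sequence directly from its definition and compares its partial sums with the number of ternary digits equal to 1.

```python
def epsilon(m):
    """Sign of m = (3a ± 1) * 3^b: +1 for (3a + 1) * 3^b, -1 for (3a - 1) * 3^b."""
    while m % 3 == 0:
        m //= 3  # strip the power of 3
    return 1 if m % 3 == 1 else -1


def ternary_ones(n):
    """Number of digits equal to 1 in the base-3 expansion of n."""
    count = 0
    while n:
        count += n % 3 == 1
        n //= 3
    return count


partial = 0
for n in range(1, 1001):
    partial += epsilon(n)
    assert partial == ternary_ones(n)
print(partial)
```

Because the sequence is completely multiplicative, the same bound applies to the sums along every homogeneous progression d, 2d, 3d, ..., not only to the case d = 1.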

In the light of that example, it is natural to investigate the following weakening of Erdős’s
discrepancy conjecture, which Erdős also asked.

Conjecture 3.1. Let $\epsilon_1, \epsilon_2, \epsilon_3, \dots$ be a completely multiplicative sequence taking values in
the set $\{-1,1\}$. Then the partial sums $\sum_{m=1}^{n} \epsilon_m$ are unbounded.

Remarkably, this conjecture is also very much open. Later we shall discuss evidence that 
it may be more or less as hard as the discrepancy problem itself. 

What about the other direction? The sequence (1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1) has
length 11 and has discrepancy 1 (where by “discrepancy” we mean discrepancy with respect
to the set of all homogeneous arithmetic progressions). This turns out to be the longest such
sequence [25]. Surprisingly, the longest sequence with discrepancy 2 is much longer: there
are a very large number of sequences of length 1124 with discrepancy 2, and it appears that
this is the longest that such a sequence can be, though this has not yet been definitively
proved. These experimental results, and almost all of the observations that follow, were
discovered by the participants in Polymath5, an online collaboration that attacked the
Erdős discrepancy problem in 2010 [29]. The fact that these sequences are so long gives
one reason that the problem is so hard: it is difficult to imagine what a proof would be
like that shows that the discrepancy of a ±1 sequence tends to infinity with the length of
the sequence, while failing to prove the false result that the discrepancy of a sequence of
length 1000 is at least 3.
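The claims about length 11 are small enough to verify by exhaustive search. The Python sketch below is an illustration (the function name `discrepancy` is an assumption): it computes the discrepancy of a finite sequence with respect to homogeneous arithmetic progressions, checks the sequence quoted above, and runs through all $2^{12}$ sign sequences of length 12.

```python
from itertools import product


def discrepancy(seq):
    """Max over d and n with n*d <= len(seq) of |seq_d + seq_2d + ... + seq_nd|.
    seq[0] plays the role of the first term of the sequence."""
    length = len(seq)
    worst = 0
    for d in range(1, length + 1):
        partial = 0
        for index in range(d, length + 1, d):
            partial += seq[index - 1]
            worst = max(worst, abs(partial))
    return worst


SEQ11 = (1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1)
print(discrepancy(SEQ11))  # 1

# Exhaustive check over all 2^12 sign sequences of length 12.
print(min(discrepancy(s) for s in product((-1, 1), repeat=12)))  # 2
```

The same brute-force approach is hopeless for discrepancy 2, where the relevant lengths exceed a thousand; the sketch is only meant to make the definition concrete.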

That is not the only reason for the problem’s being hard. Another reason is that it is
not easy to turn the problem into an analytic one - a technique that is extremely helpful
for many other problems. It would be very nice if the result were true not because the
sequence consists of 1s and -1s but merely because it is large in some appropriate sense:
for example, perhaps any sequence with values in [-1,1] such that the average magnitude
of the terms is non-zero could be expanded in terms of some cleverly chosen orthonormal
basis, and perhaps this would prove that its discrepancy was unbounded. But a very
simple example appears to kill off this hope straight away: the discrepancy of the periodic
sequence 1, -1, 0, 1, -1, 0, ... is 1, and yet the average magnitude of its terms is 2/3. Later
we shall see that this example is not quite as problematic as it at first appears. Note that
this example is a Dirichlet character: it is intriguing that the “difficult” examples we know
of all seem to be built out of characters in simple ways.
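The two properties of the periodic example can be checked on an initial segment. In the Python sketch below, the function names and the cutoff 300 are arbitrary illustrative choices; the sequence is the non-trivial character modulo 3.

```python
from fractions import Fraction


def chi3(m):
    """The periodic sequence 1, -1, 0, 1, -1, 0, ... (m = 1, 2, 3, ...)."""
    return (0, 1, -1)[m % 3]


def max_hap_sum(x, N):
    """Largest |x(d) + x(2d) + ... + x(nd)| over all n, d with n*d <= N."""
    worst = 0
    for d in range(1, N + 1):
        partial = 0
        for multiple in range(d, N + 1, d):
            partial += x(multiple)
            worst = max(worst, abs(partial))
    return worst


N = 300
print(max_hap_sum(chi3, N))
print(Fraction(sum(abs(chi3(m)) for m in range(1, N + 1)), N))
```

Along a progression with d divisible by 3 every term vanishes, and otherwise multiplicativity reduces the sum to the case d = 1, where the partial sums cycle through 1, 0, 0.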

3.2. Variants of the conjecture. Sometimes, a good way of solving a problem is to
replace the statement you are trying to prove by something stronger. There are several
promising strengthenings of the Erdős discrepancy conjecture. An obvious one is to replace
±1-valued sequences by sequences that take values in some more general set. The example
presented shows that we have to be a little careful about this, but the following conjecture
is a reasonable one, and is also open.

Conjecture 3.2. Let $x_1, x_2, \dots$ be a sequence of unit vectors in a (real or complex) Hilbert
space. Then for every C there exist n, d such that $\|\sum_{m=1}^{n} x_{md}\| \geq C$.

Since $\mathbb{R}$ is a Hilbert space, this conjecture is a generalization of Erdős’s conjecture. A
conjecture intermediate between the two is one where the $x_i$ are complex numbers of
modulus 1.

A less obvious strengthening was formulated by Gil Kalai (one of the Polymaths participants), 
and called the “modular version” of the Erdős discrepancy problem. 

Conjecture 3.3. For every prime $p$ there exists $N$ such that if $x_1, x_2, \dots, x_N$ is any 
sequence of non-zero elements of $\mathbb{Z}/p\mathbb{Z}$, then for every $r \in \mathbb{Z}/p\mathbb{Z}$ there exist $n$ and $d$ with 
$nd \le N$ and $\sum_{m=1}^{n} x_{md} \equiv r \pmod{p}$. 

If we insist that each $x_i$ is ±1 mod $p$, then the conjecture becomes obviously equivalent 
to the original Erdős problem. However, since the problem does not involve products of the 
$x_i$, there is nothing special about the numbers ±1, so in this context it becomes natural 
to replace the set $\{-1, 1\}$ by the set of all non-zero elements. The motivation for this 
conjecture was the hope that the polynomial method might be applicable to it. So far this 
has not succeeded, but the modular version gives us a valuable new angle on the problem. 

A possible generalization of the modular version to composite moduli $m$ would be to ask 
that the $x_i$ are coprime to $m$ (which is obviously a necessary condition if we want to be 
able to produce all numbers r). For amusement only, we state another conjecture here. It 
is similar in spirit to the more general modular version, but not quite the same. 

Conjecture 3.4. Let $K$ be a finite set of irrational numbers and let $x_1, x_2, \dots$ be a sequence 
of elements of $K$. Then the sums $s_{n,d} = \sum_{m=1}^{n} x_{md}$ are dense mod 1. 

Note that the special case where $K$ is of the form $\{\alpha, -\alpha\}$ for an irrational number $\alpha$ is 
equivalent to the original discrepancy conjecture. It is not clear whether there are any 
logical relationships between Conjectures 3.3 and 3.4. 

3.3. Some approaches to the conjecture. Although the Erdős discrepancy problem 
looks very hard, there are some approaches that at least enable one to start thinking 
seriously about it. Here we discuss three of these approaches. 

3.3.1. Completely multiplicative sequences. A close look at the very long sequences of 
discrepancy 2 that were produced experimentally reveals interesting multiplicative structure. 
The sequences are not completely multiplicative, but they appear to “want” to have 
multiplicative features. For example, if you look at the values of a completely multiplicative 
±1 sequence along a geometric progression, then they will either be constant or 
alternating. In the long sequences of discrepancy 2 we do not see that behaviour, but we do see 
quasiperiodic behaviour, at least for a while: towards the end, the patterns break down. 
There is a natural, but speculative, interpretation of this. The sequences appear to be 
some kind of “projection” to the set of ±1 sequences of highly structured sequences taking 
values in $\mathbb{C}$. Towards the end, if the structure is followed too closely, the discrepancy rises 
to 3, but for a while that can be countered by simply switching the signs of a few terms in 
the sequence. If those terms correspond to integers with not many factors, then not many 
homogeneous progressions are affected, so one can extend the length of the sequence by 
sacrificing the structure. But since it was the structure that allowed the sequence to get 
long in the first place, this process is eventually doomed: one has to make more and more 
ad hoc tweaks, and eventually it becomes impossible to continue. 
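
The remark about geometric progressions can be checked on a concrete completely multiplicative ±1 sequence. The sketch below uses the Liouville function, which is our choice of example rather than one taken from the text; the helper names are likewise hypothetical. Along a geometric progression a, ar, ar^2, ... a completely multiplicative sequence f takes the values f(a) f(r)^k, which is constant when f(r) = 1 and alternating when f(r) = -1.

```python
def liouville(n: int) -> int:
    """Liouville function (-1)^Omega(n): completely multiplicative, equal to -1 at every prime."""
    sign = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    if n > 1:  # one prime factor larger than the square root remains
        sign = -sign
    return sign


def along_geometric_progression(f, a: int, r: int, terms: int):
    """Values f(a), f(a*r), f(a*r^2), ... for the given number of terms."""
    return [f(a * r**k) for k in range(terms)]


# Ratio 6 = 2 * 3 has f(6) = +1, so the values are constant.
print(along_geometric_progression(liouville, 5, 6, 6))  # [-1, -1, -1, -1, -1, -1]
# Ratio 3 has f(3) = -1, so the values alternate.
print(along_geometric_progression(liouville, 5, 3, 6))  # [-1, 1, -1, 1, -1, 1]
```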

This picture suggests the following line of attack. Perhaps one could attempt to show 
that the worst examples - that is, the ones with lowest discrepancy - have to have some 
kind of multiplicative structure. Then one could attempt to prove the easier (one hopes) 
statement that a sequence with multiplicative structure must have unbounded discrepancy. 

An approach like this might seem a bit fanciful. Remarkably, however, there is a precise 
reduction from the Erdős discrepancy problem to a related problem about multiplicative 
sequences, discovered by Terence Tao (another Polymaths participant). With the help of 
a few lines of Fourier analysis, he proved the following result [30]. 

Proposition 3.5. Suppose that there exists an infinite ±1 sequence of discrepancy at most 
$C$. Then there exists a completely multiplicative sequence $z_1, z_2, \dots$ of complex numbers 
of modulus 1 such that the averages $N^{-1} \sum_{n=1}^{N} |\sum_{i=1}^{n} z_i|^2$ are bounded above by a constant 
depending on $C$. 

Thus, to prove the Erdős discrepancy problem, it is enough to prove the following 
conjecture about completely multiplicative complex-valued sequences. 

Conjecture 3.6. There exists a function $\omega : \mathbb{N} \to \mathbb{R}$ tending to infinity with the following 
property. Let $z_1, z_2, \dots$ be any completely multiplicative sequence of complex 
numbers of modulus 1. For each $n$ let $s_n$ be the $n$th partial sum of this sequence. Then 
$(|s_1|^2 + \dots + |s_N|^2)/N \ge \omega(N)$ for every $N$. 

This is not quite the same as saying that every completely multiplicative sequence has 
unbounded discrepancy, even if we generalize to the complex case. What it says is not 
just that the worst partial sums of such a sequence should be large, but that the average 
partial sums should be large (uniformly over all such sequences). However, if the weaker 
statement is true, then it looks likely that the stronger statement will be true as well. 
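
To make the quantity in Conjecture 3.6 concrete, here is a minimal Python sketch that builds a completely multiplicative sequence from its values at the primes and evaluates the mean of the squared partial sums. The two sequences used (value -1 at every prime, and value i at every prime) and the cutoffs are illustrative assumptions; the experiment says nothing about how the averages behave in general.

```python
def completely_multiplicative(prime_value, N):
    """Return [z_1, ..., z_N] with z_1 = 1, z_p = prime_value(p) and z_{mn} = z_m * z_n."""
    z = [None] * (N + 1)
    z[1] = 1
    for n in range(2, N + 1):
        p = next(q for q in range(2, n + 1) if n % q == 0)  # smallest prime factor of n
        z[n] = prime_value(p) * z[n // p]
    return z[1:]


def mean_square_partial_sums(z):
    """(|s_1|^2 + ... + |s_N|^2) / N, where s_n = z_1 + ... + z_n and N = len(z)."""
    total, s = 0.0, 0
    for term in z:
        s += term
        total += abs(s) ** 2
    return total / len(z)


N = 1000  # illustrative cutoff
minus_one_at_primes = completely_multiplicative(lambda p: -1, N)  # the Liouville function
i_at_primes = completely_multiplicative(lambda p: 1j, N)          # complex values of modulus 1

for name, z in (("-1 at primes", minus_one_at_primes), ("i at primes", i_at_primes)):
    print(name, [round(mean_square_partial_sums(z[:M]), 2) for M in (10, 100, 1000)])
```

Since the maximum of the |s_n|^2 is at least their average, a lower bound on this mean is formally stronger than a lower bound on the largest partial sum, which is the distinction drawn in the paragraph above.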

A pessimistic view of this reduction would be to say that it shows that the multiplicative 
problem is probably just as hard as the original. However, completely multiplicative 
sequences have so much more structure than arbitrary sequences that it is not clear that 
such pessimism is justified. 

3.3.2. Semidefinite programming. The following very nice observation, made by Moses 
Charikar (yet another Polymaths participant), offers a way round the obstacle that 
the sequence 1, -1, 0, 1, -1, 0, ... has bounded discrepancy. 

Proposition 3.7. Suppose that we can find non-negative coefficients $c_{m,d}$ for each pair of 
natural numbers $m$ and $d$, and a sequence $(b_n)$ such that $\sum_{m,d} c_{m,d} = 1$, $\sum_n b_n = \infty$, and 
the real quadratic form 

$$\sum_{m,d} c_{m,d} (x_d + x_{2d} + \dots + x_{md})^2 - \sum_n b_n x_n^2$$

is positive semidefinite. Then every ±1 sequence has unbounded discrepancy. 

Proof. If $(\epsilon_n)$ is a ±1 sequence, then the positive semidefiniteness of the quadratic form 
tells us that 

$$\sum_{m,d} c_{m,d} (\epsilon_d + \epsilon_{2d} + \dots + \epsilon_{md})^2 \ge \sum_n b_n \epsilon_n^2 = \sum_n b_n.$$

Since $\sum_{m,d} c_{m,d} = 1$ and $\sum_n b_n = \infty$, it follows that the sums $\epsilon_d + \dots + \epsilon_{md}$ are unbounded. □ 

The same argument shows that if $\sum_n b_n = C$ then there exist $m, d$ such that 
$|\epsilon_d + \epsilon_{2d} + \dots + \epsilon_{md}| \ge \sqrt{C}$. It also proves the Hilbert-space version of the Erdős discrepancy 
conjecture, since if the $x_i$ are vectors in a Hilbert space, then the non-negative definiteness 
of the quadratic form implies that 

$$\sum_{m,d} c_{m,d} \|x_d + x_{2d} + \dots + x_{md}\|^2 - \sum_n b_n \|x_n\|^2$$

is non-negative (as can be seen by expanding out the norms and looking at each coordinate). 

Less obviously, the existence of a quadratic form satisfying the conditions of Proposition 
3.7 is actually equivalent to a positive solution to the Hilbert-space version of the conjecture. 

Proposition 3.8. Suppose that every infinite sequence of unit vectors in a real Hilbert 
space has unbounded discrepancy. Then for every $C$ there exists $N$, a set of non-negative 
coefficients $c_{m,d}$ for each pair of natural numbers $m$ and $d$ with $md \le N$, and a sequence 
$(b_1, \dots, b_N)$ such that $\sum_{m,d} c_{m,d} = 1$, $\sum_{n=1}^{N} b_n \ge C$, and the real quadratic form 

$$\sum_{m,d} c_{m,d} (x_d + x_{2d} + \dots + x_{md})^2 - \sum_n b_n x_n^2$$

is positive semidefinite. 

Proof. For each $m, d$ with $md \le N$ define $A_{m,d}$ to be the $N \times N$ matrix with $ij$th entry equal 
to 1 if both $i$ and $j$ belong to the arithmetic progression $\{d, 2d, \dots, md\}$ and 0 otherwise. 
Then the conclusion tells us that there exists an $N \times N$ diagonal matrix with entries adding 
up to at least $C$ that can be written as a convex combination of the matrices $A_{m,d}$ minus a 
positive semidefinite matrix. If this cannot be done, then by the Hahn-Banach separation 
theorem there must be a functional that separates the convex set of diagonal matrices with 
entries adding up to at least $C$ from the convex set consisting of convex combinations of the 
$A_{m,d}$ minus positive semidefinite matrices. Let us regard this functional as an $N \times N$ matrix 
$B$ in the inner product space that consists of all $N \times N$ matrices with square-summable 
entries and the obvious inner product. 

What properties must this matrix $B$ have? We may suppose that $\langle D, B \rangle \ge 1$ for every 
diagonal matrix $D$ with entries adding up to at least $C$ and $\langle A, B \rangle \le 1$ whenever $A$ is a 
convex combination of the matrices $A_{m,d}$ minus a positive semidefinite matrix. The first 
condition implies that $B$ is constant on the diagonal and that the constant is at least $C^{-1}$. 

The second condition implies that $B$ has non-negative inner product with every positive 
semidefinite matrix, since if $A$ were a counterexample, then we could make $\langle -\lambda A, B \rangle$ 
arbitrarily large and positive by taking $\lambda$ sufficiently large and positive. In particular, if 
$x \in \mathbb{R}^N$ and we take $A$ to be the positive semidefinite matrix $x \otimes x$ (that is, the matrix 
with $ij$th element $x_i x_j$), then $\langle x, Bx \rangle = \langle x \otimes x, B \rangle \ge 0$, so $B$ is itself positive semidefinite. 
This is well known to be equivalent to the assertion that there are vectors $v_1, \dots, v_N$ in an 
inner product space such that $B_{ij} = \langle v_i, v_j \rangle$ for every $i, j$. Since $B_{ii} = c \ge C^{-1}$ for every $i$, 
we find that each vector $v_i$ has norm $\sqrt{c}$. 

Finally, since the zero matrix is positive semidefinite, the second condition also implies 
that $B$ must have inner product at most 1 with each $A_{m,d}$. In terms of the vectors $v_i$, this 
is precisely the statement that $\|v_d + v_{2d} + \dots + v_{md}\|^2 \le 1$, as can be seen by expanding 
the left-hand side. 
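
The expansion used in this last step, namely that the inner product of $A_{m,d}$ with a Gram matrix $B_{ij} = \langle v_i, v_j \rangle$ equals $\|v_d + v_{2d} + \dots + v_{md}\|^2$, can be verified numerically. The following Python sketch does so for every admissible pair (m, d) with N = 12; the integer test vectors and the helper names are arbitrary assumptions made for the check, and the vectors are not normalized since the identity does not need it.

```python
def hap_matrix(m, d, N):
    """A_{m,d}: N x N 0/1 matrix, 1 at (i, j) iff i and j both lie in {d, 2d, ..., md}."""
    if m * d > N:
        raise ValueError("progression must fit inside {1, ..., N}")
    indicator = [1 if i % d == 0 and i <= m * d else 0 for i in range(1, N + 1)]
    return [[a * b for b in indicator] for a in indicator]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matrix_inner(A, B):
    """Entrywise inner product of two matrices of the same shape."""
    return sum(dot(row_a, row_b) for row_a, row_b in zip(A, B))


def hap_sum_norm_squared(vectors, m, d):
    """||v_d + v_{2d} + ... + v_{md}||^2 for a 1-indexed list of vectors."""
    total = [sum(coords) for coords in zip(*(vectors[k * d - 1] for k in range(1, m + 1)))]
    return dot(total, total)


N = 12
# Arbitrary integer test vectors v_1, ..., v_N in R^3 (exact arithmetic, no rounding).
vectors = [[(i * j) % 5 - 2 for j in range(1, 4)] for i in range(1, N + 1)]
B = [[dot(u, v) for v in vectors] for u in vectors]  # Gram matrix B_ij = <v_i, v_j>

identity_holds = all(
    matrix_inner(hap_matrix(m, d, N), B) == hap_sum_norm_squared(vectors, m, d)
    for d in range(1, N + 1)
    for m in range(1, N // d + 1)
)
print(identity_holds)  # True
```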

If we now rescale so that the $v_i$ become unit vectors, this last inequality changes to 
$\|v_d + v_{2d} + \dots + v_{md}\|^2 \le K$, for some constant $K \le C$. 

Therefore, if the conclusion fails for some constant $C$, we can find, for each $N$, a sequence 
of $N$ unit vectors of discrepancy at most $\sqrt{C}$. After applying a suitable rotation, we may 
assume that for each $n$ the $n$th vector in this sequence lies in the span of the first $n$ standard 
basis vectors of $\mathbb{R}^N$. Therefore, an easy compactness argument gives us an infinite sequence 
of unit vectors with discrepancy at most $\sqrt{C}$, a contradiction. □ 

Recall that the problem with the sequence 1, -1, 0, 1, -1, 0, ... is that it is “large” in 
a natural sense (namely having average magnitude bounded away from zero), but has 
bounded discrepancy. What Proposition 3.7 tells us is that there is a chance of proving 
that every sequence that is large with respect to a suitable weighted norm - the weighted 
$\ell_2$-norm with weights $b_n$ - has unbounded discrepancy. Thus, there is after all a way of 
making the problem analytic rather than purely combinatorial. 

What can we say about a set of weights that would work? The lesson of the troublesome 
1, -1, 0, 1, -1, 0, ... example is that the weights should be concentrated on numbers with 
many factors. For example, if the sum of the $b_n$ over all non-multiples of 3 is infinite, 
then the weights cannot work, since then if $(x_n)$ is the troublesome sequence, we have 
$\sum_n b_n x_n^2 = \infty$ and yet the discrepancy is finite. (This does not contradict Proposition 3.7; 
it just means that for this choice of $(b_n)$ we cannot find appropriate coefficients $c_{m,d}$.) 
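
The obstruction can be seen numerically. The sketch below evaluates the partial sums of $\sum_n b_n x_n^2$ for the troublesome sequence under two hypothetical weight systems, both with divergent total weight: one supported on the non-multiples of 3 and one supported on the multiples of 3. Neither is proposed in the text, and the second is not claimed to work; it merely avoids this particular obstruction.

```python
def chi3(n):
    """The troublesome sequence 1, -1, 0, 1, -1, 0, ..."""
    return (0, 1, -1)[n % 3]


def weighted_square_sum(b, x, N):
    """Partial sum b_1 x_1^2 + ... + b_N x_N^2."""
    return sum(b(n) * x(n) ** 2 for n in range(1, N + 1))


def harmonic_off_threes(n):
    """Hypothetical weights 1/n placed on the non-multiples of 3."""
    return 0.0 if n % 3 == 0 else 1.0 / n


def harmonic_on_threes(n):
    """Hypothetical weights 1/n placed on the multiples of 3."""
    return 1.0 / n if n % 3 == 0 else 0.0


for N in (10**2, 10**3, 10**4, 10**5):
    print(
        N,
        round(weighted_square_sum(harmonic_off_threes, chi3, N), 3),
        weighted_square_sum(harmonic_on_threes, chi3, N),
    )
```

The first column of sums keeps growing (roughly like two thirds of log N) even though the sequence has discrepancy 1, so those weights cannot satisfy the hypotheses of Proposition 3.7; the second column is identically zero because the sequence vanishes on the support of the weights.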

It is not easy to write down a set of weights that has any chance of working - in fact, 
that is worth stating as an open problem - albeit not a wholly precise one. 

Problem 3.9. Find a system of weights $(b_n)$ with $\sum_n b_n = \infty$ for which it is reasonable 
to conjecture that every sequence $(x_n)$ such that $\sum_n b_n x_n^2 = \infty$ has unbounded discrepancy. 

One of the things that makes Proposition 3.7 interesting is that it suggests an experimental 
line of attack on the Erdős discrepancy problem. First, one uses semidefinite programming 
to determine, for some large $N$, the sequence $(b_1, \dots, b_N)$ with largest sum such that the 
diagonal matrix with those weights can be written as a convex combination of the matrices 
$A_{m,d}$ minus a positive semidefinite matrix. Next, one stares hard at the sequence and tries 
to spot enough patterns in it to make a guess at an infinite sequence that would work. 
Finally, one attempts to decompose the corresponding infinite diagonal matrix (perhaps 
using the experimental values of the coefficients $c_{m,d}$ as a guide). 

Some efforts were made by Polymaths participants in this direction, but so far they 
have not succeeded. One problem is that cutting off sharply at N appears to introduce 
misleading “edge-effects”. But even if one finds ways of smoothing the cutoff, the 
experimental data is hard to interpret, though it certainly confirms the principle that the weights 
$b_n$ should be concentrated on positive integers $n$ with many factors. Another serious 
difficulty is that because we already know that there are very long sequences with small 
discrepancy, the matrices we find experimentally will have to be extremely large if they are 
to give us non-trivial lower bounds for discrepancy - large enough that the semidefinite 
programming algorithms take a long time to run. Despite these difficulties, this still seems 
like a promising approach that should be explored further. 

3.3.3. Representing diagonal matrices. We end by mentioning an approach based on an 
observation that is somewhat similar to Proposition 3.7 but that does not involve the 
slightly tricky concept of positive semidefiniteness. This approach was again one of the 
fruits of the Polymaths discussion. 

Let us define a HAP matrix to be a matrix $A$ of the following form. Take two 
homogeneous arithmetic progressions $P$ and $Q$ and define $A_{ij}$ to be 1 if $i \in P$ and $j \in Q$ and 0 
otherwise. In other words, a HAP matrix is the characteristic function of a product of two 
homogeneous arithmetic progressions. 
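
The property of HAP matrices that matters for Proposition 3.10 is that the bilinear form $\langle \epsilon, A\epsilon \rangle$ factorizes as the product of the sum of $\epsilon$ over $P$ and the sum of $\epsilon$ over $Q$. The Python sketch below builds one HAP matrix and checks this factorization exhaustively over all ±1 sequences of length 6; the particular progressions and the helper names are illustrative assumptions.

```python
from itertools import product


def hap(d, m):
    """The homogeneous arithmetic progression {d, 2d, ..., md}."""
    return [d * k for k in range(1, m + 1)]


def hap_matrix(P, Q, N):
    """N x N matrix with entry 1 at (i, j) when i is in P and j is in Q (1-indexed), else 0."""
    in_P, in_Q = set(P), set(Q)
    return [[1 if (i in in_P and j in in_Q) else 0 for j in range(1, N + 1)]
            for i in range(1, N + 1)]


def bilinear(eps, A):
    """<eps, A eps> = sum over i, j of eps_i * A_ij * eps_j."""
    n = len(eps)
    return sum(eps[i] * A[i][j] * eps[j] for i in range(n) for j in range(n))


N = 6
P, Q = hap(2, 3), hap(1, 4)  # {2, 4, 6} and {1, 2, 3, 4}
A = hap_matrix(P, Q, N)

factorizes = all(
    bilinear(eps, A) == sum(eps[i - 1] for i in P) * sum(eps[j - 1] for j in Q)
    for eps in product((-1, 1), repeat=N)
)
print(factorizes)  # True
```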

Proposition 3.10. Suppose that there exists an $N \times N$ diagonal matrix of trace at least 
$C$ that belongs to the symmetric convex hull of all HAP matrices. Then every ±1 sequence 
of length $N$ has discrepancy at least $\sqrt{C}$. 

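Proof. Let the diagonal matrix $D$ have diagonal entries $b_1, \dots, b_N$ and suppose that it can 
be written as $\sum_i \lambda_i A_i$ with $\sum_i |\lambda_i| \le 1$ and with each $A_i$ a HAP matrix. Let $\epsilon = (\epsilon_1, \dots, \epsilon_N)$ 
be a ±1 sequence. Then 

$$C \le \sum_n b_n \epsilon_n^2 = \langle \epsilon, D\epsilon \rangle = \sum_i \lambda_i \langle \epsilon, A_i \epsilon \rangle.$$

It follows that there exists $i$ such that $|\langle \epsilon, A_i \epsilon \rangle| \ge C$. If $P$ and $Q$ are the HAPs from which 
$A_i$ is built, then 

$$\langle \epsilon, A_i \epsilon \rangle = \Bigl(\sum_{j \in P} \epsilon_j\Bigr)\Bigl(\sum_{k \in Q} \epsilon_k\Bigr),$$

which implies that at least one of $\sum_{j \in P} \epsilon_j$ and $\sum_{k \in Q} \epsilon_k$ has modulus at least $\sqrt{C}$. □ 

Once again, the argument generalizes easily to unit vectors in a Hilbert space. And 
again there is an implication in the other direction. 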

Proposition 3.11. Let $C$ be a constant, let $N$ be a positive integer, and suppose that 
for every $N \times N$ real matrix $A = (a_{ij})$ with 1s on the diagonal there exist homogeneous 
arithmetic progressions $P$ and $Q$ such that $|\sum_{i \in P} \sum_{j \in Q} a_{ij}| \ge C$. Then there is a diagonal 
matrix of trace at least $C$ that belongs to the symmetric convex hull of all HAP matrices. 

Proof. Again we use the Hahn-Banach theorem. If no such diagonal matrix exists, then 
there is a linear functional, which we can represent as taking the inner product with a 
matrix $A$, that separates diagonal matrices of trace at least $C$ from convex combinations 
of HAP matrices and minus HAP matrices. If $\langle D, A \rangle \ge 1$ for every diagonal matrix $D$ of 
trace at least $C$, then $A$ must be constant on the diagonal and the constant must be at 
least $C^{-1}$. And if $|\langle B, A \rangle| < 1$ for every HAP matrix $B$, then for any two homogeneous 
arithmetic progressions $P$ and $Q$ we have $|\sum_{i \in P} \sum_{j \in Q} a_{ij}| < 1$. If we choose $\lambda$ 
such that $\lambda A$ has 1s along the diagonal, then the matrix $\lambda A$ contradicts our hypothesis. □ 

In the light of this proposition (which is easily seen to be an equivalence) it is natural to 
make the following conjecture, which is yet another strengthening of the Erdős discrepancy 
problem. 

Conjecture 3.12. For every $C$ there exists $N$ such that if $A = (a_{ij})$ is any real $N \times N$ 
matrix with 1s on the diagonal, then there exist homogeneous arithmetic progressions $P$ 
and $Q$ such that $|\sum_{i \in P} \sum_{j \in Q} a_{ij}| \ge C$. 

If we apply that conjecture in the case where $a_{ij} = \epsilon_i \epsilon_j$ for some ±1 sequence $(\epsilon_1, \dots, \epsilon_N)$, 
then the conclusion is that $|\sum_{i \in P} \sum_{j \in Q} \epsilon_i \epsilon_j| \ge C$, from which it follows that the sequence 
has discrepancy at least $\sqrt{C}$. Thus, the conjecture really is a strengthening of the Erdős 
discrepancy conjecture. Indeed, given how much weaker the condition of having 1s on the 
diagonal is than the condition of being a tensor product of two ±1 sequences, it is a very 
considerable strengthening. And yet it still appears to have a good chance of being true. 

4. Conclusion 

The aim of this paper has been to give some idea of what is currently known about 
two notable conjectures of Erdős concerning arithmetic progressions. It has therefore been 
more about questions than answers, but Erdős would have been the last person to mind 
that. I imagine him sitting with “the book” open at the relevant page, smiling at us as we 
struggle to find the proofs that he is now able to enjoy. 